AITopics | additional instruction

Collaborating Authors

additional instruction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

be2e1b68b44f2419e19f6c35a1b8cf35-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsFeb-17-2026, 21:02:51 GMT

data mining, large language model, machine learning, (27 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.27)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Russia (0.14)
(92 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
(10 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Information Management (1.00)
Information Technology > Human Computer Interaction (1.00)
(8 more...)

Add feedback

The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models Hannah Rose Kirk 1

Neural Information Processing SystemsOct-10-2025, 15:19:22 GMT

Human feedback is central to the alignment of Large Language Models (LLMs).

huggingface api open ac cess, individualised human feedback reveal, subjective and multicultural alignment, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.27)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Russia (0.14)
(92 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

Xu, Chejian, Zhang, Jiawei, Chen, Zhaorun, Xie, Chulin, Kang, Mintong, Potter, Yujin, Wang, Zhun, Yuan, Zhuowen, Xiong, Alexander, Xiong, Zidi, Zhang, Chenhui, Yuan, Lingzhi, Zeng, Yi, Xu, Peiyang, Guo, Chengquan, Zhou, Andy, Tan, Jeffrey Ziwei, Zhao, Xuandong, Pinto, Francesco, Xiang, Zhen, Gai, Yu, Lin, Zinan, Hendrycks, Dan, Li, Bo, Song, Dawn

arXiv.org Artificial IntelligenceMar-18-2025

Multimodal foundation models (MMFMs) play a crucial role in various applications, including autonomous driving, healthcare, and virtual assistants. However, several studies have revealed vulnerabilities in these models, such as generating unsafe content by text-to-image models. Existing benchmarks on multimodal models either predominantly assess the helpfulness of these models, or only focus on limited perspectives such as fairness and privacy. In this paper, we present the first unified platform, MMDT (Multimodal DecodingTrust), designed to provide a comprehensive safety and trustworthiness evaluation for MMFMs. Our platform assesses models from multiple perspectives, including safety, hallucination, fairness/bias, privacy, adversarial robustness, and out-of-distribution (OOD) generalization. We have designed various evaluation scenarios and red teaming algorithms under different tasks for each perspective to generate challenging data, forming a high-quality benchmark. We evaluate a range of multimodal models using MMDT, and our findings reveal a series of vulnerabilities and areas for improvement across these perspectives. This work introduces the first comprehensive and unique safety and trustworthiness evaluation platform for MMFMs, paving the way for developing safer and more reliable MMFMs and systems. Our platform and benchmark are available at https://mmdecodingtrust.github.io/.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2503.14827

Country:

Europe > Switzerland > Zürich > Zürich (0.13)
North America > United States > Illinois > Cook County > Chicago (0.04)
Oceania (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Road (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)

Add feedback

FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks

Wang, Jiongxiao, Wu, Fangzhou, Li, Wendi, Pan, Jinsheng, Suh, Edward, Mao, Z. Morley, Chen, Muhao, Xiao, Chaowei

arXiv.org Artificial IntelligenceNov-25-2024

Large language models (LLMs) have been widely deployed as the backbone with additional tools and text information for real-world applications. However, integrating external information into LLM-integrated applications raises significant security concerns. Among these, prompt injection attacks are particularly threatening, where malicious instructions injected in the external text information can exploit LLMs to generate answers as the attackers desire. While both training-time and test-time defense methods have been developed to mitigate such attacks, the unaffordable training costs associated with training-time methods and the limited effectiveness of existing test-time methods make them impractical. This paper introduces a novel test-time defense strategy, named Formatting AuThentication with Hash-based tags (FATH). Unlike existing approaches that prevent LLMs from answering additional instructions in external text, our method implements an authentication system, requiring LLMs to answer all received instructions with a security policy and selectively filter out responses to user instructions as the final output. To achieve this, we utilize hash-based authentication tags to label each response, facilitating accurate identification of responses according to the user's instructions and improving the robustness against adaptive attacks. Comprehensive experiments demonstrate that our defense method can effectively defend against indirect prompt injection attacks, achieving state-of-the-art performance under Llama3 and GPT3.5 models across various attack methods. Our code is released at: https://github.com/Jayfeather1024/FATH

information, instruction, text information, (11 more...)

arXiv.org Artificial Intelligence

2410.21492

Country:

Asia > South Korea (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > California (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Follow-Up Questions Improve Documents Generated by Large Language Models

Tix, Bernadette J

arXiv.org Artificial IntelligenceJun-27-2024

This study investigates the impact of Large Language Models (LLMs) generating follow-up questions in response to user requests for short (1-page) text documents. Users provided prompts requesting documents they would like the AI to produce. The AI then generated questions to clarify the user's needs before generating the requested documents. Users answered the questions and then indicated their preference between a document generated using both the initial prompt and the questions and answers, and a document generated using only the initial prompt, and gave feedback about their experience with the question-answering process.

follow-up question, llm, qa document, (17 more...)

arXiv.org Artificial Intelligence

2407.12017

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)
Asia > India (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.46)

Industry: Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Add feedback

Silico-centric Theory of Mind

Mukherjee, Anirban, Chang, Hannah Hanwen

arXiv.org Artificial IntelligenceMar-14-2024

Theory of Mind (ToM) refers to the ability to attribute mental states, such as beliefs, desires, intentions, and knowledge, to oneself and others, and to understand that these mental states can differ from one's own and from reality. We investigate ToM in environments with multiple, distinct, independent AI agents, each possessing unique internal states, information, and objectives. Inspired by human false-belief experiments, we present an AI ('focal AI') with a scenario where its clone undergoes a human-centric ToM assessment. We prompt the focal AI to assess whether its clone would benefit from additional instructions. Concurrently, we give its clones the ToM assessment, both with and without the instructions, thereby engaging the focal AI in higher-order counterfactual reasoning akin to human mentalizing--with respect to humans in one test and to other AI in another. We uncover a discrepancy: Contemporary AI demonstrates near-perfect accuracy on human-centric ToM assessments. Since information embedded in one AI is identically embedded in its clone, additional instructions are redundant. Yet, we observe AI crafting elaborate instructions for their clones, erroneously anticipating a need for assistance. An independent referee AI agrees with these unsupported expectations. Neither the focal AI nor the referee demonstrates ToM in our 'silico-centric' test.

clone, information, instruction, (17 more...)

arXiv.org Artificial Intelligence

2403.09289

Country:

Asia > Singapore (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

AmbigNLG: Addressing Task Ambiguity in Instruction for NLG

Niwa, Ayana, Iso, Hayate

arXiv.org Artificial IntelligenceFeb-27-2024

In this study, we introduce AmbigNLG, a new task designed to tackle the challenge of task ambiguity in instructions for Natural Language Generation (NLG) tasks. Despite the impressive capabilities of Large Language Models (LLMs) in understanding and executing a wide range of tasks through natural language interaction, their performance is significantly hindered by the ambiguity present in real-world instructions. To address this, AmbigNLG seeks to identify and mitigate such ambiguities, aiming to refine instructions to match user expectations better. We introduce a dataset, AmbigSNI-NLG, consisting of 2,500 instances, and develop an ambiguity taxonomy for categorizing and annotating instruction ambiguities. Our approach demonstrates substantial improvements in text generation quality, highlighting the critical role of clear and specific instructions in enhancing LLM performance in NLG tasks.

additional instruction, ambiguity, instruction, (14 more...)

arXiv.org Artificial Intelligence

2402.17717

Country:

Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(12 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback